Network models are widely used to represent relational information
among interacting units and the structural implications of these
relations. Recently, social network studies have focused a great deal 
of attention on random graph models of networks whose nodes rep- 
resent individual social actors and whose edges represent a specified 
relationship between the actors. 

Most inference for social network models assumes that the pres- 
ence or absence of all possible links is observed, that the information is 
completely reliable, and that there are no measurement (e.g., recording)
errors. This is clearly not true in practice, as much network data
is collected through sample surveys. In addition, even if a census of
a population is attempted, individuals and links between individuals 
are missed (i.e., do not appear in the recorded data). 

In this paper we develop the conceptual and computational the- 
ory for inference based on sampled network information. We first 
review forms of network sampling designs used in practice. We con- 
sider inference from the likelihood framework, and develop a typology 
of network data that reflects their treatment within this frame. We 
then develop inference for social network models based on informa- 
tion from adaptive network designs. 

We motivate and illustrate these ideas by analyzing the effect of 
link-tracing sampling designs on a collaboration network. 

1. Introduction. Networks are a useful device to represent "relational 
data," that is, data with properties beyond the attributes of the individuals 
(nodes) involved. Relational data arise in many fields and network models 
are a natural approach to representing the patterns of the relations between 
nodes. Networks can be used to describe such diverse ideas as the behavior
of epidemics, the interconnectedness of corporate boards, and networks
of genetic regulatory interactions. In social network applications, the nodes
in a graph typically represent individuals, and the ties (edges) represent a 
specified relationship between individuals. Nodes can also be used to repre- 
sent larger social units (groups, families, organizations), objects (airports, 
servers, locations) or abstract entities (concepts, texts, tasks, random vari- 
ables). We consider here stochastic models for such graphs. These models 
attempt to represent the stochastic mechanisms that produce relational ties, 
and the complex dependencies thus induced. 

Social network data typically consist of a set of n actors and a relational
tie random variable, $Y_{ij}$, measured on each possible ordered pair of actors,
$(i, j)$, $i, j = 1, \ldots, n$, $i \neq j$. In the most simple cases, $Y_{ij}$ is a dichotomous
variable, indicating the presence or absence of some relation of interest,
such as friendship, collaboration, transmission of information or disease, etc.
The data are often represented by an $n \times n$ sociomatrix $Y$, with diagonal
elements, representing self-ties, treated as structural zeros. In the case of
binary relations, the data can also be thought of as a graph in which the
nodes are actors and the edge set is $\{(i, j) : Y_{ij} = 1\}$. For many networks the
relations are undirected in the sense that $Y_{ij} = Y_{ji}$, $i, j = 1, \ldots, n$.

In the application in this paper we consider a network formed from the 
collaborative working relations between n = 36 partners in a New England 
law firm [Lazega (2001)]. We focus on the undirected relation where a tie 
is said to exist between two partners if and only if both indicate that they 
collaborate with the other. The scientific objective is to explain the observed 
structural pattern of collaborative ties as a function of nodal and relational 
attributes. The relational data is supplemented by four actor attributes: 
seniority (the rank number of chronological entry into the firm divided by 
36), practice (there are two possible values, litigation = 0 and corporate
law = 1), gender (3 of the 36 lawyers are female) and office (there are three 
different offices in three different cities each of different size). 

For large or hard-to-find populations of actors it is difficult to obtain 
information on all actors and all relational ties. As a result, various survey 
sampling strategies and methods are applied. Some of these methods make 
use of network information revealed by earlier stages of sampling to guide 
later sampling. These adaptive designs allow for more efficient sampling than 
conventional sampling designs. We consider such designs in Section 2. 

Most of the work presented here considers the network over the set of 
actors to be the realization of a stochastic process. We seek to model that 
process. An alternative is to view the network as a fixed structure about 
which we wish to make inference based on partial observation. 

In this paper we develop a theoretical framework for inference from net- 
work data that are partially-observed due to sampling. This work extends 
the fundamental work of Thompson and Frank (2000). For purposes of presentation,
we focus on the relational data itself and suppress reference to
covariates of the nodes. This more general situation is dealt with in
Handcock and Gile (2007). 

In Section 2 we present a conceptual framework for network sampling. 
We extend this framework in Section 3 to focus on inference from sampled 
network data. We first consider the limitations of design-based inference in 
this setting, then focus on likelihood-based inference. Section 4 presents the 
rich Exponential Family Random Graph Model (ERGM) family of models 
that has been applied to complete network data. Section 5 presents a study 
of the effect of sampling from a known complete network of law firm collab- 
orations. Finally, in Section 6, we discuss the overall ramifications for the 
modeling of social networks with sampled data and note some extensions. 

2. Network sampling design. In this section we consider the conceptual 
and computational theory of network sampling. 

There is a substantial literature on network sampling designs. Our devel- 
opment here follows Thompson and Seber (1996) and Thompson and Frank 
(2000). Let $\mathcal{Y}$ denote the set of possible networks on the n actors. Note that
in most network samples, the unit of sampling is the actor or node, while
the unit of analysis is typically the dyad. Let $D$ be the $n \times n$ random binary
matrix indicating if the corresponding element of $Y$ was sampled or not. The
value of the $(i, j)$th element is 0 if the ordered pair was not sampled and
1 if the element was sampled. Denote the sample space of $D$ by $\mathcal{D}$. We shall
refer to the probability distribution of $D$ as the sampling design. The sampling
design is often related to the structure of the graph and a parameter
$\psi \in \Psi$, so we posit a model for it. Specifically, let $P(D = d \mid Y = y; \psi)$ denote
the probability of selecting sample $d$ given a network $y$ and parameter $\psi$.

Under many sampling designs the set of sampled dyads is determined 
by the set of sampled nodes. Let $S$ represent a binary random n-vector
indicating a subset of the nodes, where the $i$th element is 1 if the $i$th node
is part of the set, and is 0 otherwise. We often consider situations where $D$
is determined by some $S$ which is itself a result of a sample design denoted
by $P(S \mid Y, \psi)$. For example, consider an undirected network where the set of
observed dyads are those that are incident on at least one of the sampled
nodes. In this case $D = S \circ 1 + 1 \circ S - S \circ S$, where 1 is the binary n-vector
of 1s. A primary example of this is where people are sampled and surveyed
to determine all their edges. 
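
As an illustration only, the mapping from a node sample to the dyad indicator matrix can be sketched in a few lines of Python. The function name `observed_dyads` and the example vector are hypothetical and not part of the paper; the sketch simply evaluates $d_{ij} = s_i + s_j - s_i s_j$ elementwise, which is the $(i, j)$th element of $S \circ 1 + 1 \circ S - S \circ S$.

```python
def observed_dyads(s):
    """Return the matrix d with d[i][j] = s[i] + s[j] - s[i]*s[j].

    s is a 0/1 list marking sampled nodes (hypothetical helper, for
    illustration). A dyad is observed when at least one endpoint is sampled.
    """
    n = len(s)
    return [[s[i] + s[j] - s[i] * s[j] for j in range(n)] for i in range(n)]


# Example: nodes 0 and 3 are sampled out of four nodes.
s = [1, 0, 0, 1]
D = observed_dyads(s)
for row in D:
    print(row)
```

In this example only the dyad between the two unsampled nodes 1 and 2 is unobserved.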

We introduce further notation to allow us to refer to the observed and 
unobserved portions of the relational structures. Denote the observed part of
the complete graph $Y$ by $Y_{\mathrm{obs}} = \{Y_{ij} : D_{ij} = 1\}$ and the unobserved part by
$Y_{\mathrm{mis}} = \{Y_{ij} : D_{ij} = 0\}$. The full observed data is then $\{Y_{\mathrm{obs}}, D\}$, in contrast
to the complete data: $\{Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, D\}$. We will write the complete graph $Y =
\{Y_{\mathrm{obs}}, Y_{\mathrm{mis}}\}$. In addition, we make the convention that undefined numbers
act as identity elements in addition and multiplication. So a number $x$ plus
or multiplied by an undefined number $y$ is $x$, and hence $Y = Y_{\mathrm{obs}} + Y_{\mathrm{mis}}$. For
a given network $y \in \mathcal{Y}$, denote the corresponding data as $\{y_{\mathrm{obs}}, d\}$ and the
other elements by their lower-case versions $y = y_{\mathrm{obs}} + y_{\mathrm{mis}}$. Finally denote
$\mathcal{Y}(y_{\mathrm{obs}}) = \{v : y_{\mathrm{obs}} + v \in \mathcal{Y}\}$, that is, the set of possible unobserved elements
which together with $y_{\mathrm{obs}}$ result in a valid network. The set $y_{\mathrm{obs}} + \mathcal{Y}(y_{\mathrm{obs}})$ is
then the restriction of $\mathcal{Y}$ to $y_{\mathrm{obs}}$.

A sampling design is conventional if it does not use information collected 
during the survey to direct subsequent sampling of individuals (e.g., net- 
work census and ego-centric designs). Specifically, a design is conventional 
if $P(D = d \mid Y = y; \psi) = P(D = d \mid \psi)$ for all $y \in \mathcal{Y}$. A simple example of a
conventional sampling design for networks is an ego-centric design, consisting
of a simple random sampling of a subset of the actors, followed by
complete observation of the dyads originating from those actors. A com- 
plete census of the network is another. More complex examples include de- 
signs using probability sampling of pairs and auxiliary variables. Alterna- 
tively, we call a sampling design adaptive if it uses information collected 
during the survey to direct subsequent sampling, but the sampling design 
depends only on the observed data. Specifically, a design is adaptive if
$P(D = d \mid Y = y; \psi) = P(D = d \mid Y_{\mathrm{obs}} = y_{\mathrm{obs}}; \psi)$ for all $y \in y_{\mathrm{obs}} + \mathcal{Y}(y_{\mathrm{obs}})$. Hence a
design can be adaptive for a given $y_{\mathrm{obs}}$ (rather than all possible observed
data), although most common such designs are adaptive for all possible data
observed under them. Conventional designs can be considered to be special 
cases of adaptive designs. 

Note that adaptive sampling designs satisfy 

(2.1) $P(D = d \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, \psi) = P(D = d \mid Y_{\mathrm{obs}}, \psi)$,

a condition called "missing at random" by Rubin (1976) in the context of 
missing data. Note that this term is somewhat misleading: it does not say that the
propensity to be observed is unrelated to the unobserved portions of the
network, but that this relationship can be explained by the data that are
observed. The observed part of the data is often vital to equality (2.1).
Hence adaptive designs are essentially those for which the unobserved dyads 
are missing at random. 

Denote by $[a]$ the vector-valued function that is 1 if the corresponding
element of the vector $a$ is logically true, and 0 otherwise. Let $a \times b$ be the
elementwise product of the column vector $a$ and the column vector $b$ and
$a \cdot b$ be the scalar product $\sum_j a_j b_j$. Let $a \circ b$ be the outer product matrix
with $(i, j)$th element $a_i b_j$. If $y$ is a matrix and $b$ a vector let $y \cdot b$ be the column
vector with $i$th element $\sum_j y_{ji} b_j$.

2.1. Some adaptive designs for undirected networks. We now consider 
several examples of adaptive designs for undirected networks.

2.1.1. Example: Ego-centric design. Consider a simple ego-centric design:

1. Select individuals at random, each with probability $\psi$.

2. Observe all dyads involving the selected individuals (i.e., dyads with at 
least one of the selected individuals as one of the pair of actors). 

The sampling design can be determined for this case. First note that

$$P(D_{ij} = 1 \mid Y, \psi) = 1 - (1 - \psi)^2 \quad \forall i \neq j.$$

This, however, does not give the joint distribution of $D$. Let $S$ be the binary
n-vector where 1 and 0 indicate that the corresponding individual has been
selected, or not, respectively. Within this design, $S$ is determined by $D$ (i.e.,
$S = [D1 = (n - 1)1]$). Then $P(S = s \mid Y, \psi) = \psi^{1 \cdot s}(1 - \psi)^{n - 1 \cdot s}$, $s \in \{0, 1\}^n$. If
the $i$th element of $S$ is 1 then all elements in the $i$th row and column of $D$
are 1. $D_{ij} = 0$ if and only if the $i$th and $j$th elements of $S$ are both 0.
Hence the probability distribution of $D$ is

$$P(D = d \mid Y, \psi) = \psi^{1 \cdot s}(1 - \psi)^{n - 1 \cdot s}$$

for

$$d = 1 \circ s + s \circ 1 - s \circ s, \quad s \in \{0, 1\}^n.$$

Note that the distribution does not depend on $Y$, and the design is therefore
conventional.
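
The design distribution of this example is small enough to enumerate exactly for a handful of nodes. The following Python sketch is an illustration under stated assumptions (the function names, n = 4 and the selection probability 0.3 are hypothetical choices, not taken from the paper). It builds the distribution of the dyad indicator matrix over all node samples and checks the marginal dyad probability against $1 - (1 - \psi)^2$.

```python
from itertools import product


def ego_centric_design(n, psi):
    """Enumerate P(D = d | psi) over all node samples s in {0,1}^n.

    Each node is selected independently with probability psi; d[i][j] = 1
    when at least one of nodes i and j is selected.
    """
    design = {}
    for s in product((0, 1), repeat=n):
        k = sum(s)
        prob = psi ** k * (1 - psi) ** (n - k)
        d = tuple(tuple(s[i] | s[j] for j in range(n)) for i in range(n))
        design[d] = design.get(d, 0.0) + prob
    return design


def dyad_inclusion(design, i, j):
    """Marginal probability that dyad (i, j) is observed."""
    return sum(p for d, p in design.items() if d[i][j] == 1)


design = ego_centric_design(4, 0.3)
print(sum(design.values()))          # total probability of the design
print(dyad_inclusion(design, 0, 1))  # compare with 1 - (1 - 0.3)**2
```

No network appears anywhere in the computation, which is the sense in which the design is conventional.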

2.1.2. Example: One-wave link-tracing design. We refer to any sample 
in which subsequent nodes are enrolled based on their observed relations 
with other sampled nodes as a link-tracing design. Consider the one-wave 
link-tracing design specified as follows: 

1. Select individuals at random, each with probability $\psi$.

2. Observe all dyads involving the selected individuals. 

3. Identify all individuals reported to have at least one relation with the 
initial sample, and select them with probability 1. 

4. Observe all dyads involving the newly selected individuals. 

Let $S_0$ denote the indicator vector for the initial sample and $S_1$ the
indicator for the added individuals not in the initial sample. Then the whole
sample of individuals is $S = S_0 + S_1$. As in the undirected ego-centric design,
$D = 1 \circ S + S \circ 1 - S \circ S$. Note that $S_1 = [Y S_0 \times (1 - S_0) > 0]$ is derivable
from $S_0$ and $Y$. Hence

$$P(D = d \mid Y, \psi) = \sum_{s_0 :\, s_0 + [Y s_0 \times (1 - s_0) > 0] = s} \psi^{1 \cdot s_0}(1 - \psi)^{n - 1 \cdot s_0}$$

for

$$d = 1 \circ s + s \circ 1 - s \circ s, \quad s \in \{0, 1\}^n.$$

2.1.3. Example: Multi-wave link-tracing design. Consider a multi-wave
link-tracing design in which the complete set of partners of the $k$th wave
are enrolled, that is, the link-tracing process described above is carried out
$k$ times. If $k$ is fixed in advance this is called $k$-wave link-tracing.

Let $S_0$ denote the indicator for the initial sample, $S_1$ the indicator for
the added individuals in the first wave not in the initial sample, ..., $S_k$ the
indicator for the added individuals in wave $k$ not in the prior samples. Then
the whole sample of individuals is $S = S_0 + S_1 + \cdots + S_k$. As in the ego-centric
design $D = 1 \circ S + S \circ 1 - S \circ S$. Note that $S_m = [Y S_{m-1} \times (1 - \sum_{t=0}^{m-1} S_t) > 0]$,
$m = 1, \ldots, k$ is derivable from $S_0$ and $Y$. Then

$$P(D = d \mid Y, \psi) = \sum_{s_0 :\, s_0 + s_1 + \cdots + s_k = s} \psi^{1 \cdot s_0}(1 - \psi)^{n - 1 \cdot s_0}$$

for $d = 1 \circ s + s \circ 1 - s \circ s$, $s \in \{0, 1\}^n$. Here $S_m = [Y S_{m-1} \times (1 - \sum_{t=0}^{m-1} S_t) > 0]
= [Y_{\mathrm{obs}} S_{m-1} \times (1 - \sum_{t=0}^{m-1} S_t) > 0]$, $m = 1, \ldots, k$ so that the individuals
selected in the successive waves only depend on the observed part of the
graph, and not on the unobserved portions of the graph. Clearly, this is also
true for one-wave link-tracing as a simple case of $k$-wave link-tracing. Note
that it may be possible that $S_m = 0$ for some $m < k$, so that subsequent
waves do not increase the sample size (i.e., $S_k = 0$). A variant of the $k$-wave
link-tracing design is the saturated link-tracing design, in which sampling
continues until wave $m$, such that $S_m = 0$. We interpret $k$ as the bound
on the number of waves sampled imposed by the sampling design. Since
saturated link-tracing does not restrict the number of waves sampled, we
represent it by setting $k = \infty$.
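
The wave recursion for undirected networks can be sketched as follows. This is a minimal illustration, assuming the network is given as a symmetric 0/1 adjacency list of lists; the function name `link_trace` and the convention `k=None` for the saturated design are hypothetical choices made here, not notation from the paper. One-wave link-tracing corresponds to `k=1`.

```python
def link_trace(y, s0, k=None):
    """Trace links from the initial sample s0 for at most k waves.

    y  : symmetric 0/1 adjacency matrix (list of lists)
    s0 : 0/1 list marking the initial sample
    k  : number of waves, or None to continue until saturation
    Returns (s, waves): the 0/1 indicator of all sampled nodes and the
    list of per-wave indicators, starting with the initial sample.
    """
    n = len(y)
    sampled = list(s0)
    wave = list(s0)
    waves = [list(s0)]
    m = 0
    while k is None or m < k:
        # New wave: nodes not yet sampled that have a tie to the
        # previous wave.
        nxt = [int(sampled[i] == 0 and any(y[i][j] and wave[j] for j in range(n)))
               for i in range(n)]
        if not any(nxt):
            break  # an empty wave means all later waves are empty too
        waves.append(nxt)
        sampled = [a + b for a, b in zip(sampled, nxt)]
        wave = nxt
        m += 1
    return sampled, waves


# Example: a path 0-1-2-3-4 plus an isolated node 5; only node 0 is seeded.
n = 6
y = [[0] * n for _ in range(n)]
for a, b in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    y[a][b] = y[b][a] = 1
s0 = [1, 0, 0, 0, 0, 0]

print(link_trace(y, s0, k=1)[0])  # one wave
print(link_trace(y, s0, k=2)[0])  # two waves
print(link_trace(y, s0)[0])       # saturated
```

Each wave is computed only from ties reported by nodes already in the sample, so the recursion never consults unobserved dyads.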

2.2. Some adaptive designs for directed networks. We can also consider 
variants of these adaptive designs for directed networks. 

2.2.1. Example: Ego-centric design. Consider a simple ego-centric de- 
sign: 

1. Select individuals at random, each with probability $\psi$.

2. Observe all directed dyads originating at the selected individuals. 

As before, the sampling design can be determined for this case. Since a 
directed dyad is observed only if its tail node is sampled, 

$$P(D_{ij} = 1 \mid Y, \psi) = \psi \quad \forall i \neq j$$

and $D = S \circ 1$. Hence the probability distribution of $D$ is

$$P(D = d \mid Y, \psi) = \psi^{1 \cdot s}(1 - \psi)^{n - 1 \cdot s}$$

for $d = s \circ 1$, $s \in \{0, 1\}^n$ and the distribution does not depend on $Y$. As in
the undirected case, this design is therefore conventional.

2.2.2. Example: One-wave link-tracing design. Consider a one-wave link-tracing
design on a directed network specified as follows:

1. Select individuals at random, each with probability $\psi$.

2. Observe all directed dyads originating at the selected individuals. 

3. Identify all individuals receiving an arc from a member of the initial 
sample, and select them with probability 1. 

4. Observe all directed dyads originating at the newly selected individuals. 

Let $S_0$ denote the indicator vector for the initial sample and $S_1$ the
indicator for the added individuals not in the initial sample. Then the whole
sample of individuals is $S = S_0 + S_1$. As in the ego-centric design $D = S \circ 1$
and

$$P(D = d \mid Y, \psi) = \sum_{s_0 :\, s_0 + [Y \cdot s_0 \times (1 - s_0) > 0] = s} \psi^{1 \cdot s_0}(1 - \psi)^{n - 1 \cdot s_0}$$

for $d = s \circ 1$, $s \in \{0, 1\}^n$.

2.2.3. Example: Multi-wave link-tracing design. Consider a directed version
of the multi-wave link-tracing design in which the complete set of
out-partners of the $k$th wave are enrolled. The whole sample of individuals
is $S = S_0 + S_1 + \cdots + S_k$. And $S_m = [Y \cdot S_{m-1} \times (1 - \sum_{t=0}^{m-1} S_t) > 0]$,
$m = 1, \ldots, k$ is derivable from $S_0$ and $Y$. Then

$$P(D = d \mid Y, \psi) = \sum_{s_0 :\, s_0 + s_1 + \cdots + s_k = s} \psi^{1 \cdot s_0}(1 - \psi)^{n - 1 \cdot s_0}$$

for $d = s \circ 1$, $s \in \{0, 1\}^n$, where we note that $S_m = [Y \cdot S_{m-1} \times (1 - \sum_{t=0}^{m-1} S_t) > 0]
= [Y_{\mathrm{obs}} \cdot S_{m-1} \times (1 - \sum_{t=0}^{m-1} S_t) > 0]$, $m = 1, \ldots, k$ so that the individuals
selected in successive waves depend only on the previously observed part
of the graph, and not on the unobserved portions. The saturated link-tracing
design is represented by $k = \infty$.

3. Inferential frameworks. In this section we consider two frameworks 
for inference based on sampled data. In the design-based framework y repre- 
sents the fixed population and interest focuses on characterizing y based on 
partial observation. The random variation considered is due to the sampling 
design alone. A key advantage of this approach is that it does not require 
a model for the data themselves, although a model may also be used to 
guide design-based inference [Särndal, Swensson and Wretman (1992)]. Under
the model-based framework, $Y$ is stochastic and is a realization from a
stochastic process depending on a parameter $\eta$. Here interest focuses on $\eta$,
which characterizes the mechanism that produced the complete network Y. 
We find severe limitations of the design-based framework for data from link- 
tracing samples, and focus on likelihood inference within the model-based 
framework.

3.1. Design-based inference for the network. In the design-based frame,
the unobserved data values, or some functions thereof, are analogous to
the parameters of interest in likelihood inference. The population of data
values is treated as fixed, and all uncertainty in the estimates is due to the
sampling design, which is typically assumed to be fully known (not just up
to the parameter $\psi$).

Inference typically focuses on identifying design-unbiased estimators for 
quantities of interest measured on the complete network. In an undirected 
network analysis setting, for example, we can consider estimating $\tau = \sum_{i<j} y_{ij}$,
the number of edges in the network. Note that $y$ is a partially-observed matrix
of constants in this setting. Then $\hat\tau$ is design-unbiased for $\tau$ if

$$E_D[\hat\tau \mid \psi, y] = \tau,$$

where the expectation is taken over realizations of the sampling process.
Specifically,

$$E_D[\hat\tau(Y_{\mathrm{obs}}, D) \mid \psi, y] = \sum_{d \in \mathcal{D}} \hat\tau(y_{\mathrm{obs}}(d), d) P(D = d \mid \psi, y),$$

where $\hat\tau(y_{\mathrm{obs}}(d), d)$ is the estimator expressed as a function of the observed
network information. Similarly, the variance of the estimator is computed
with respect to the variation induced by the sampling procedure

$$V_D[\hat\tau(Y_{\mathrm{obs}}, D) \mid \psi, y] = \sum_{d \in \mathcal{D}} (\hat\tau(y_{\mathrm{obs}}(d), d) - \tau)^2 P(D = d \mid \psi, y).$$

The Horvitz-Thompson estimator is a classic tool of design-based in- 
ference, and is based on inverse-probability weighting the sample. In our 
example, it is 



$$\hat\tau(Y_{\mathrm{obs}}, D) = \sum_{i<j :\, D_{ij} = 1} \frac{y_{ij}}{\pi_{ij}},$$

where the dyadic sampling probability $\pi_{ij} = P(D_{ij} = 1 \mid \psi, y)$ is the probability
of observing dyad $(i, j)$.

Consider an estimator of $\tau$ based on relations observed through the ego-centric
design of Section 2.1.1. Then

$$\pi_{ij} = 1 - (1 - \psi)^2 \quad \forall i, j.$$

The classic Horvitz-Thompson estimator $\hat\tau$ of $\tau$ then weights each observation
by the inverse of its sampling probability

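$$\hat\tau = \sum_{i<j :\, D_{ij} = 1} \frac{y_{ij}}{\pi_{ij}} = \frac{1}{1 - (1 - \psi)^2} \sum_{i<j :\, D_{ij} = 1} y_{ij}.$$

Then the design variance of $\hat\tau$ takes the standard Horvitz-Thompson form

$$V_D[\hat\tau \mid \psi, y] = \sum_{i<j} \sum_{k<l} \left\{ \frac{\pi_{ij,kl}}{\pi_{ij}\pi_{kl}} - 1 \right\} y_{ij} y_{kl},$$

where $\pi_{ij,kl} = P(S_{0i} + S_{0j} > 0, S_{0k} + S_{0l} > 0)$ or

$$\pi_{ij,kl} = \begin{cases} \pi_{ij}, & i = k,\ j = l, \\ \pi_{ij}\pi_{kl}, & i \notin \{k, l\} \text{ and } j \notin \{k, l\}, \\ \pi_{ij} - \psi(1 - \psi)^2, & \text{otherwise.} \end{cases}$$

Among the many available estimators for the variance of the Horvitz-Thompson
estimator is the Horvitz-Thompson variance estimator:

$$\hat V(\hat\tau) = \sum_{i<j :\, D_{ij} = 1} \sum_{k<l :\, D_{kl} = 1} \left\{ [1 - (1 - \psi)^2]^{-2} \pi_{ij,kl} - 1 \right\} \frac{y_{ij} y_{kl}}{\pi_{ij,kl}}.$$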

Note the importance of the unit sampling probabilities in these estima- 
tors. This is a hallmark of design-based inference: inference relies on full 
knowledge of the sampling procedure in order to make unbiased inference 
without making assumptions about the distribution of the unobserved data. 
This typically requires knowledge of the sampling probability of each unit 
in the sample. This procedure is complicated in the network context, in that 
we require the sampling probabilities of the units of analysis, dyads, which 
are different from the units of sampling, nodes. In fact, for even single-wave 
link-tracing samples, the dyadic sampling probabilities are not observable. 
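
For the ego-centric design, where the dyadic sampling probability is known, design-unbiasedness of the Horvitz-Thompson edge-count estimator can be checked numerically on a toy network. The sketch below is illustrative only: the four-node network, the value 0.3 for the selection probability and the function names are assumptions made for the example. It computes the exact design expectation by enumerating every possible node sample.

```python
from itertools import product


def ht_edge_count(y, s, psi):
    """Horvitz-Thompson estimate of the edge count from ego-centric sample s."""
    n = len(y)
    pi = 1 - (1 - psi) ** 2  # dyadic sampling probability under this design
    observed = sum(y[i][j]
                   for i in range(n) for j in range(i + 1, n)
                   if s[i] or s[j])  # dyad seen if either endpoint is sampled
    return observed / pi


def expected_ht(y, psi):
    """Exact design expectation of the estimator over all node samples."""
    n = len(y)
    total = 0.0
    for s in product((0, 1), repeat=n):
        k = sum(s)
        total += psi ** k * (1 - psi) ** (n - k) * ht_edge_count(y, s, psi)
    return total


# Toy undirected network with three edges: 0-1, 0-2, 2-3.
y = [[0, 1, 1, 0],
     [1, 0, 0, 0],
     [1, 0, 0, 1],
     [0, 0, 1, 0]]

print(ht_edge_count(y, (1, 0, 0, 0), 0.3))  # estimate from one particular sample
print(expected_ht(y, 0.3))                  # design expectation
```

The expectation equals the true edge count because each edge is observed with the same probability that appears in the weight. The same check is not available for link-tracing designs, where that probability depends on unobserved ties.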

To see this, define the nodal neighborhood of a dyad $N(i, j)$, where
$k \in N(i, j) \Leftrightarrow \{S_{0k} = 1 \Rightarrow D_{ij} = 1\}$. Then $\pi_{ij} = P(\exists k : S_{0k} = 1, k \in N(i, j))$.
For the one-wave link-tracing design of Section 2.1.2, $N(i, j) = \{k : y_{ik} = 1
\text{ or } y_{jk} = 1 \text{ or } k \in \{i, j\}\}$. Then if the initial sample $S_0$ is drawn according
to the design in Section 2.1.2, $\pi_{ij} = 1 - (1 - \psi)^{\|N(i,j)\|}$. Suppose $S_{0i} = 1$, and
$S_{0j} = 0$. Then dyad $(i, j)$ is observed, but $\|N(i, j)\|$ is unknown because it
is unknown which $k$ satisfy $y_{jk} = 1$. The link-tracing sampling structures
for which nodal and dyadic sampling probabilities are observable are summarized
in Table 1. For directed networks, we assume sampled nodes provide
information on their out-arcs only, so that $D$ is not symmetric and
$D_{ij} = 1 \Leftrightarrow S_i = 1$.
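
To make the dependence on unobserved ties concrete, the following Python sketch computes the one-wave dyadic sampling probability from the complete network and checks it against exact enumeration of initial samples. It is a sketch under stated assumptions: the path network, the selection probability 0.3 and the function names are hypothetical. The key point is that `neighborhood_size` reads rows of the full adjacency matrix, which an analyst holding only the sample would not have.

```python
from itertools import product


def neighborhood_size(y, i, j):
    """Number of nodes whose initial selection makes dyad (i, j) observed."""
    n = len(y)
    return sum(1 for k in range(n) if k in (i, j) or y[i][k] or y[j][k])


def one_wave_dyad_prob(y, i, j, psi):
    """Closed form: 1 - (1 - psi) ** |N(i, j)|."""
    return 1 - (1 - psi) ** neighborhood_size(y, i, j)


def exact_one_wave_dyad_prob(y, i, j, psi):
    """Same probability by enumerating every initial sample s0."""
    n = len(y)
    total = 0.0
    for s0 in product((0, 1), repeat=n):
        # Final sample: initial nodes plus anyone tied to an initial node.
        s = [int(s0[a] or any(y[a][b] and s0[b] for b in range(n)))
             for a in range(n)]
        if s[i] or s[j]:
            k = sum(s0)
            total += psi ** k * (1 - psi) ** (n - k)
    return total


# Path network 0-1-2-3-4.
n = 5
y = [[0] * n for _ in range(n)]
for a, b in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    y[a][b] = y[b][a] = 1

print(neighborhood_size(y, 0, 1))
print(one_wave_dyad_prob(y, 0, 1, 0.3), exact_one_wave_dyad_prob(y, 0, 1, 0.3))
```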

Of the designs considered here, dyadic sampling probabilities are observable
only for ego-centric samples, and never for link-tracing designs. Nodal
sampling probabilities are also observable for ego-centric sampling, as well
as for one-wave and saturated link-tracing designs in undirected networks.
Overall, Table 1 shows strong limitations to the applicability of design-based
methods requiring the knowledge of sampling probabilities to link-tracing
designs. Note that this limitation is not specific to dyad-based network
statistics. Estimation of triad-based network statistics such as a triad
census would be subject to similar limitations. A Horvitz-Thompson style
estimator would rely on a weighted sum of observed triads, weighted according
to sampling probabilities. Sampling probabilities for triads would be even
more complex, as they would typically require sampling of two of the three
nodes involved in an undirected case, and at least two of the three nodes
in a directed case, depending on the triad census. Neither of these sampling
probabilities could be computed for link-tracing samples in
which the degrees or in-degrees of some involved nodes are unobserved.

Table 1
Observable sampling probabilities under various sampling schemes for directed and
undirected networks. Nodal and dyadic sampling probabilities are considered separately.
"X" indicates observable sampling probabilities, while a blank indicates unobservable
sampling probabilities

                          Nodal probabilities $\pi_i$      Dyadic probabilities $\pi_{ij}$
Sampling scheme           Undirected    Directed         Undirected    Directed
Ego-centric               X             X                X             X
One-wave                  X
k-wave, 1 < k < ∞
Saturated                 X

Not surprisingly, most of the work on design-based estimators for link- 
tracing samples has focused on the cases where sampling probabilities are 
observable: typically for one-wave or saturated samples used to estimate 
population means of nodal covariates. Frank (2005) presents a good overview
and extensive citations to this literature. See also Thompson and Collins 
(2002); Snijders (1992). Although examples tend to focus on instances where 
sampling probabilities are observable, the limited applicability of classical 
design-based methods in estimating structural network features based on 
link-tracing samples has not been emphasized in the literature. 

In the absence of observable sampling probabilities, design-based infer- 
ence requires a mechanism for estimating sampling probabilities. This is 
most often necessary in the context of out-of-design missing data, and ad- 
dressed with approaches such as propensity scoring [Rosenbaum and Rubin 
(1983)], which rely on auxiliary information available for the full sampling 
frame to estimate unknown sampling probabilities. Link-tracing differs from 
the traditional context of such methods in that the sampling probabilities 
are unobserved even when the design is executed faithfully, and in that the 
unknown sampling probabilities result directly from the unobserved vari- 
able of interest. In particular, estimating unknown sampling probabilities is 
equivalent to estimating unobserved relations based on the observed rela- 
tions. One approach is to augment the sample with sufficient information 
to allow for determination of the sampling probabilities. However, in most
cases, this requires a substantial expansion of the sampling design. Therefore,
in practice we must rely on a model relating the observed portions of
the network structure to the unobserved portions. Lack of reliance on an 
assumed outcome model is a great advantage of the design-based frame- 
work over the model-based framework. By introducing a model to estimate 
sampling probabilities based on the outcome of interest, we reintroduce this 
reliance on model form, negating much of the advantage of the design-based 
framework. Furthermore, note that the naive use of this approach has an 
ad-hoc flavor, while still requiring complex observation weights and variance 
estimators. 

In the next section, we describe an alternative, more flexible likelihood
approach to network inference based on link-tracing samples. 

3.2. Likelihood-based inference. Consider a parametric model for the random
behavior of $Y$ depending on a parameter p-vector $\eta$:

(3.1) $P_\eta(Y = y)$, $\eta \in \Xi$.

In the model-based framework, if $Y$ is completely observed, inference for $\eta$
can be based on the likelihood

$$L[\eta \mid Y = y] \propto P_\eta(Y = y).$$

This situation has been considered in detail in Hunter and Handcock (2006)
and the references therein. In the general case, where $Y$ may be only partially
observed, we can consider using the (so-called) face-value likelihood based
solely on $Y_{\mathrm{obs}}$:

(3.2) $L[\eta \mid Y_{\mathrm{obs}} = y_{\mathrm{obs}}] \propto \sum_{v \in \mathcal{Y}(y_{\mathrm{obs}})} P_\eta(Y = y_{\mathrm{obs}} + v)$.

This ignores the additional information about $\eta$ available in $D$. Inference
for $\eta$ and $\psi$ should be based on all the available observed data, including
the sampling design information. This likelihood is any function of $\eta$ and $\psi$
proportional to $P(D, Y_{\mathrm{obs}} \mid \eta, \psi)$:

$$L[\eta, \psi \mid Y_{\mathrm{obs}} = y_{\mathrm{obs}}, D = d_{\mathrm{obs}}]
\propto P(D = d_{\mathrm{obs}}, Y_{\mathrm{obs}} = y_{\mathrm{obs}} \mid \eta, \psi)
= \sum_{v \in \mathcal{Y}(y_{\mathrm{obs}})} P(D = d_{\mathrm{obs}} \mid Y = y_{\mathrm{obs}} + v, \psi) P_\eta(Y = y_{\mathrm{obs}} + v).$$

Thus the correct model is related to the complete data model through the
sampling design as well as the observed nodes and dyads.

In likelihood inference, the sampling parameter $\psi$ is a nuisance parameter,
and modeling the sampling design along with the data structure adds a great
deal of complexity. It is natural to ask when we might consider the simpler
face-value likelihood, (3.2), which ignores the sampling design.

In the context of missing data, Rubin (1976) introduced the concept of
ignorability to specify when inference based on the face-value likelihood is
efficient. We introduce the term amenability to represent the notion of
ignorability for network sampling strategies within a likelihood framework.

In many situations where models are used, the parameters $\eta \in \Xi$ and
$\psi \in \Psi$ are distinct, in the sense that the joint parameter space of $(\eta, \psi)$ is
$\Psi \times \Xi$. If the sampling design is adaptive and the parameters $\eta$ and $\psi$ are
distinct,

$$L[\eta, \psi \mid Y_{\mathrm{obs}} = y_{\mathrm{obs}}, D = d_{\mathrm{obs}}]
\propto P(D = d_{\mathrm{obs}} \mid Y_{\mathrm{obs}} = y_{\mathrm{obs}}, \psi) \sum_{v \in \mathcal{Y}(y_{\mathrm{obs}})} P_\eta(Y = y_{\mathrm{obs}} + v)
\propto L[\psi \mid D = d_{\mathrm{obs}}, Y_{\mathrm{obs}} = y_{\mathrm{obs}}] \times L[\eta \mid Y_{\mathrm{obs}} = y_{\mathrm{obs}}].$$

Thus if the sampling design is adaptive and the structural and sampling
parameters are distinct, then the sampling design is ignorable in the sense
that the resulting likelihoods are proportional. When this condition is satisfied,
likelihood-based inference for $\eta$, as proposed here, is unaffected by the
(possibly unknown) sampling design. This leads to the following definition
and result.

Definition. Consider a sampling design governed by parameter $\psi \in \Psi$
and a stochastic network model $P_\eta(Y = y)$ governed by parameter $\eta \in \Xi$.
We call the sampling design amenable to the model if the sampling design
is adaptive and the parameters $\psi$ and $\eta$ are distinct.

Result. Consider networks produced by the stochastic network model
$P_\eta(Y = y)$ governed by parameter $\eta \in \Xi$ which are observed by a sampling
design with parameter $\psi \in \Psi$ amenable to the model. Then the likelihood
for $\eta$ and $\psi$ is

$$L[\eta, \psi \mid Y_{\mathrm{obs}} = y_{\mathrm{obs}}, D = d_{\mathrm{obs}}] \propto L[\psi \mid D = d_{\mathrm{obs}}, Y_{\mathrm{obs}} = y_{\mathrm{obs}}] \times L[\eta \mid Y_{\mathrm{obs}} = y_{\mathrm{obs}}].$$

Thus likelihood-based inference for $\eta$ from $L[\eta, \psi \mid Y_{\mathrm{obs}}, D]$ will be the same
as likelihood-based inference for $\eta$ based on $L[\eta \mid Y_{\mathrm{obs}}]$.

This result shows that for standard designs such as the ego-centric, single-wave
and multi-wave sampling designs in Section 2, likelihood-based inference can
be based on the face-value likelihood $L[\eta \mid Y_{\mathrm{obs}}]$. This was first noted in the
foundational paper of Thompson and Frank (2000). Explicitly, this is

$$L[\eta \mid Y_{\mathrm{obs}} = y_{\mathrm{obs}}] \propto P(Y_{\mathrm{obs}} = y_{\mathrm{obs}} \mid \eta) = \sum_{v \in \mathcal{Y}(y_{\mathrm{obs}})} P_\eta(Y = y_{\mathrm{obs}} + v).$$

Hence we can evaluate the likelihood by just enumerating the full data
likelihood over all possible values for the missing data.
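
For a very small network the enumeration over the missing data can be carried out directly. The sketch below is purely illustrative and makes a strong simplifying assumption: the network model is an independent-edge model with common log-odds `eta`, chosen here only because the sum then has a closed form to compare against. The function names and the dictionary encoding of the observed data are hypothetical.

```python
from itertools import product
from math import exp


def bernoulli_prob(n_edges, n_dyads, eta):
    """P_eta(Y = y) for an independent-edge model with log-odds eta (assumed)."""
    return exp(eta * n_edges) / (1 + exp(eta)) ** n_dyads


def face_value_likelihood(y_obs, eta):
    """Sum P_eta(Y = y_obs + v) over all completions v of the missing dyads.

    y_obs maps each dyad (i, j), i < j, to 0, 1, or None (unobserved).
    """
    missing = [dyad for dyad, value in y_obs.items() if value is None]
    observed_edges = sum(v for v in y_obs.values() if v is not None)
    total = 0.0
    for v in product((0, 1), repeat=len(missing)):
        total += bernoulli_prob(observed_edges + sum(v), len(y_obs), eta)
    return total


# Four nodes, six dyads; two dyads are unobserved.
y_obs = {(0, 1): 1, (0, 2): 1, (0, 3): 0,
         (1, 2): None, (1, 3): 0, (2, 3): None}

eta = -0.5
p = exp(eta) / (1 + exp(eta))
print(face_value_likelihood(y_obs, eta))
print(p ** 2 * (1 - p) ** 2)  # closed form under independence
```

Under independence the missing dyads sum out and only the four observed dyads contribute. For models with dependence among ties no such shortcut exists, and the number of completions doubles with every missing dyad, so exhaustive enumeration quickly becomes infeasible.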

We may also wish to make inference about the design parameter $\psi$. The
likelihood for $\psi$ based on the observed data is any function of $\psi$ proportional
to $P(D, Y_{\mathrm{obs}} \mid \psi)$. For designs amenable to the model this is

$$L[\psi \mid D = d_{\mathrm{obs}}, Y_{\mathrm{obs}} = y_{\mathrm{obs}}] \propto P(D = d_{\mathrm{obs}} \mid Y_{\mathrm{obs}} = y_{\mathrm{obs}}, \psi)
= P(D = d_{\mathrm{obs}} \mid Y = y_{\mathrm{obs}} + v, \psi)$$

for any choice of $v$ in $\mathcal{Y}(y_{\mathrm{obs}})$. Hence it can be computed without reference
to the network model.

4. Exponential family models for networks. The models we consider for 
the random behavior of $Y$ rely on a p-vector $g(Y)$ of statistics and a parameter
vector $\eta \in \mathbb{R}^p$. The canonical exponential family model is

(4.1) $P_\eta(Y = y) = \exp\{\eta \cdot g(y) - \kappa(\eta)\}$, $y \in \mathcal{Y}$,

where $\exp\{\kappa(\eta)\} = \sum_{u \in \mathcal{Y}} \exp\{\eta \cdot g(u)\}$ is the familiar normalizing constant
associated with an exponential family of distributions [Barndorff-Nielsen 
(1978); Lehmann (1983)]. 
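
On a handful of nodes the normalizing constant in (4.1) can be computed by brute force, which makes the definition concrete. The following Python sketch is illustrative only: the choice of statistics (edge and triangle counts), the four-node graph space and the function names are assumptions made for this example, not the specification used in the paper.

```python
from itertools import combinations, product
from math import exp, log


def graph_stats(edge_set, n):
    """g(y) = (number of edges, number of triangles) for an undirected graph."""
    triangles = sum(1 for a, b, c in combinations(range(n), 3)
                    if {(a, b), (a, c), (b, c)} <= edge_set)
    return (len(edge_set), triangles)


def log_normalizer(eta, n):
    """kappa(eta): log of the sum of exp(eta . g(u)) over all graphs u on n nodes."""
    dyads = list(combinations(range(n), 2))
    total = 0.0
    for bits in product((0, 1), repeat=len(dyads)):
        edge_set = {d for d, b in zip(dyads, bits) if b}
        g = graph_stats(edge_set, n)
        total += exp(sum(e * s for e, s in zip(eta, g)))
    return log(total)


def ergm_prob(edge_set, eta, n):
    """P_eta(Y = y) from (4.1), by brute force."""
    g = graph_stats(edge_set, n)
    return exp(sum(e * s for e, s in zip(eta, g)) - log_normalizer(eta, n))


# With a zero triangle coefficient the model reduces to independent edges,
# so kappa should equal 6 * log(1 + exp(eta_edges)) for n = 4.
print(log_normalizer((-1.0, 0.0), 4), 6 * log(1 + exp(-1.0)))
print(ergm_prob({(0, 1), (0, 2), (1, 2)}, (-1.0, 0.5), 4))
```

The enumeration visits 64 graphs for four nodes and roughly 3.5 × 10^13 for ten, which illustrates why direct computation is impractical beyond toy examples.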

The range of network statistics that might be included in the g(y) vec- 
tor is vast — see Wasserman and Faust (1994) for the most comprehensive 
treatment of these statistics — though we will consider only a few in this ar- 
ticle. We allow the vector g(y) to include covariate information about nodes 
or edges in the graph in addition to information derived directly from the 
matrix y itself. 

There has been a great deal of work on models of the form (4.1), to which 
we refer as exponential family random graph models or ERGMs for short. 
[We avoid the lengthier EFRGM, for "exponential family random graph 
models," both for the sake of brevity and because we consider some models 
in this article that should technically be called curved exponential families;
see Hunter and Handcock (2006).]

The normalizing constant is usually difficult to compute directly for $\mathcal{Y}$
containing large numbers of networks. Inference for this class of models was
considered in the seminal paper by Geyer and Thompson (1992), building on
the methods of Frank and Strauss (1986) and the above cited papers. Until
recently, inference for social network models has relied on
maximum pseudolikelihood estimation [Besag (1974); Frank and Strauss
(1986); Strauss and Ikeda (1990); Geyer and Thompson (1992)].
Geyer and Thompson (1992) proposed a stochastic algorithm to approximate
maximum likelihood estimates for model (4.1), among other models;
this Markov chain Monte Carlo (MCMC) approach forms the basis of the
method described in this article. The development of these methods for social
network data has been considered by Corander, Dahmström and Dahmström
(1998); Crouch, Wasserman and Trachtenberg (1998); Snijders (2002);
Handcock (2002); Corander, Dahmström and Dahmström (2002);
Hunter and Handcock (2006).

4.1. Likelihood-based inference for ERGM. In this section we consider
likelihood inference for $\eta$ in the case where $Y = Y_{\mathrm{obs}} + Y_{\mathrm{mis}}$ is possibly only
partially observed.

As the direct computation of the likelihood is difficult when the number
of networks in $\mathcal{Y}$ is large, we can approximate the likelihood by using the
MCMC approach of randomly sampling from the space of possible values of
the missing data and taking the mean. Alternatively, consider the conditional
distribution of $Y$ given $Y_{\mathrm{obs}}$:

$$P_\eta(Y_{\mathrm{mis}} = v \mid Y_{\mathrm{obs}} = y_{\mathrm{obs}}) = \exp[\eta \cdot g(v + y_{\mathrm{obs}}) - \kappa(\eta \mid y_{\mathrm{obs}})], \qquad v \in \mathcal{Y}(y_{\mathrm{obs}}),$$

where $\exp[\kappa(\eta \mid y_{\mathrm{obs}})] = \sum_{u \in \mathcal{Y}(y_{\mathrm{obs}})} \exp[\eta \cdot g(u + y_{\mathrm{obs}})]$. This formula gives a
simple way to sample from the conditional distribution and hence produce
multiple imputations of the full data. Specifically, the conditional distribution
of $Y$ given $Y_{\mathrm{obs}}$ is an ERGM on a constrained space of networks, and
hence one can simulate from it using a variant of the standard MCMC for
ERGM [Hunter and Handcock (2006); Handcock et al. (2003)] that restricts
the proposed networks to the subset of networks that are concordant to the
observed data.
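
A constrained sampler of this kind can be sketched as a single-dyad Metropolis chain that only ever toggles unobserved dyads. This is a minimal illustration under our own assumptions, not the statnet implementation: the statistics in `g` (edges and triangles) are hypothetical stand-ins, the proposal is a uniform choice among missing dyads, and no burn-in is discarded.

```python
import math
import random
from itertools import combinations

def g(edges, n):
    """Illustrative statistics (an assumption): (edge count, triangle count)."""
    triangles = sum(
        1 for a, b, c in combinations(range(n), 3)
        if (a, b) in edges and (a, c) in edges and (b, c) in edges
    )
    return (len(edges), triangles)

def impute_missing(eta, n, observed, missing, n_draws, thin=50, seed=0):
    """Draw imputations of the full network given Y_obs = y_obs.

    observed: dict mapping observed dyads (i, j), i < j, to 0 or 1
    missing:  list of unobserved dyads (i, j), i < j
    Only missing dyads are toggled, so every state agrees with the data.
    """
    rng = random.Random(seed)
    edges = {d for d, v in observed.items() if v}  # start with no missing edges
    current = g(edges, n)
    draws = []
    for step in range(1, n_draws * thin + 1):
        dyad = rng.choice(missing)
        proposal = edges ^ {dyad}                  # toggle one unobserved dyad
        new = g(proposal, n)
        # Symmetric proposal, so the acceptance ratio is exp{eta . (g' - g)}.
        log_ratio = sum(e * (b - a) for e, a, b in zip(eta, current, new))
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            edges, current = proposal, new
        if step % thin == 0:
            draws.append(set(edges))
    return draws
```

Each returned edge set is one imputation of the complete network; recomputing `g` from scratch at every step is wasteful and is done here only for clarity.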
Also note that

$$L[\eta \mid Y_{\mathrm{obs}} = y_{\mathrm{obs}}] \propto \exp[\kappa(\eta \mid y_{\mathrm{obs}}) - \kappa(\eta)],$$

which can then be estimated by MCMC samples: the term $\kappa(\eta)$ by a chain
on the complete data and the term $\kappa(\eta \mid y_{\mathrm{obs}})$ by a chain conditional on
$y_{\mathrm{obs}}$. So the sampled data situation is only slightly more difficult than the
complete data case.

5. Two-wave link-tracing samples from a collaboration network. In this
section we investigate the effect of network sampling on estimation by
comparing network samples to the situation where we observe the complete
network. Specifically, we consider the collaborative working relations between
36 partners in a New England law firm introduced in Section 1. These data
have been studied by many authors including Lazega (2001), Snijders et al.
(2006) and Hunter and Handcock (2006) (whom we follow).

We consider an ERGM (4.1) with two network statistics for the direct
effects of seniority and practice of the form

$$\sum_{1 \le i, j \le n} y_{ij} X_i,$$

where $X_i$ is the seniority or practice of partner $i$. We also consider three
dyadic homophily attributes based on practice, gender and office. These
are included as three network statistics indicating matches between the two
partners in the dyad on the given attribute:

$$\sum_{1 \le i < j \le n} y_{ij} I(X_i = X_j),$$

where $I(x)$ indicates the truth of the condition $x$ and $X_i$ and $X_j$ are the
practice, gender or office attribute of partner $i$ and $j$, respectively. We also
include statistics that are purely functions of the relations $y$. These are the
number of edges (essentially the density) and the geometrically weighted
edgewise shared partner statistic (denoted by GWESP), a measure of the
transitivity structure in the network [Snijders et al. (2006)]. The model is
a slightly reparameterized form of Model 2 in Hunter and Handcock (2006)
obtained by replacing the alternating $k$-triangle term with the GWESP term.
The scale parameter for the GWESP term is fixed at its optimal value
(0.7781). See Hunter and Handcock (2006) for details.
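
The statistics in this model can be computed from a sociomatrix as in the following sketch. It is an illustration rather than the statnet code: the function names are ours, the sociomatrix is assumed to be a symmetric 0/1 list of lists, and the GWESP expression is the form $e^{\tau} \sum_k \{1 - (1 - e^{-\tau})^k\} EP_k(y)$ as we recall it from Hunter and Handcock (2006), where $EP_k(y)$ counts edges whose endpoints have exactly $k$ shared partners; the present text does not restate it.

```python
import math
from itertools import combinations

def edge_count(y):
    """Number of edges in a symmetric 0/1 sociomatrix y."""
    n = len(y)
    return sum(y[i][j] for i, j in combinations(range(n), 2))

def nodal_covariate(y, x):
    """Direct effect of a nodal attribute: sum over 1 <= i, j <= n of y_ij * x_i."""
    n = len(y)
    return sum(y[i][j] * x[i] for i in range(n) for j in range(n))

def homophily(y, x):
    """Number of edges whose two endpoints match on attribute x."""
    n = len(y)
    return sum(y[i][j] for i, j in combinations(range(n), 2) if x[i] == x[j])

def gwesp(y, tau=0.7781):
    """Geometrically weighted edgewise shared partner statistic (assumed form)."""
    n = len(y)
    total = 0.0
    for i, j in combinations(range(n), 2):
        if y[i][j]:
            shared = sum(1 for k in range(n)
                         if k != i and k != j and y[i][k] and y[j][k])
            # An edge with no shared partners contributes zero.
            total += 1.0 - (1.0 - math.exp(-tau)) ** shared
    return math.exp(tau) * total
```

For a single triangle each of the three edges has exactly one shared partner, so under this form the GWESP statistic equals 3 for any value of the scale parameter.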

As discussed in Hunter and Handcock (2006), this model provides an ad- 
equate fit to the data, and we will use it here to assess the effect of sampling 
on model fit. A summary of the MLE parameters used is given in the com- 
plete data value column of Table 2. Note that we are taking these parameters 
as "truth" and considering data produced by sampling from this network. 

Table 2
Bias and Root Mean Squared Error (RMSE) of natural parameter MLE
based on two-wave samples as percentages of true parameter values and
efficiency losses

Natural         Complete      Bias    RMSE    Efficiency
parameter       data value    (%)     (%)     loss (%)

Structural
  Edges           -6.51       0.2     1.2       1.7
  GWESP            0.90       0.8     3.7       5.1
Nodal
  Seniority        0.85       0.3     3.1       1.3
  Practice         0.41       0.4     5.3       3.5
Homophily
  Practice         0.76       0.8     4.3       2.9
  Gender           0.70       0.9     4.7       1.7
  Office           1.15       0.7     2.9       2.8

Fig. 1. Schematic depiction of sampled and unobserved arc data when the sampling is
over an undirected network. [Figure: the sociomatrix in a 2x2 blocking by sampled and
not sampled nodes. The three blocks of edges involving at least one sampled node are
observed; the block of edges between not sampled nodes is unobserved.]

We construct all possible datasets produced by a two-wave link-tracing
design starting from two randomly chosen nodes (the "seeds"). This adaptive
design is amenable to the model. As there are 36 partners and the
sample is deterministic given the seeds, there are $\binom{36}{2} = 630$ possible data
sets. The number of actors in each dataset varies from just 2 to all 36
depending on the degree of connectedness of the seeds. The data pattern is
shown in Figure 1. Consider a partition of the sampled from the nonsampled
and the corresponding 2x2 blocking of the sociomatrix, with the four
blocks representing dyads from sampled and nonsampled to sampled and
nonsampled. The complete data consists of the full sociomatrix. The first
three blocks contain the observed data, the dyads involving at least one
sampled node, and the last block contains the unobserved data, those between
the nonsampled.
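
The construction of one such dataset can be sketched as follows. This is an illustration under our reading of the design, not the code in the supplement: we assume that the seeds form wave 0, that each later wave adds the not-yet-sampled partners of the previous wave, and that every sampled node reports all of its relations. The names `link_trace` and `observed_dyads` are ours.

```python
from itertools import combinations

def link_trace(adj, seeds, waves=2):
    """Return the set of sampled nodes after `waves` waves of link-tracing.

    adj maps each node to the set of its partners (undirected network).
    """
    sampled = set(seeds)
    frontier = set(seeds)
    for _ in range(waves):
        # Partners of the previous wave that have not been sampled yet.
        frontier = {nbr for node in frontier for nbr in adj[node]} - sampled
        sampled |= frontier
    return sampled

def observed_dyads(nodes, sampled):
    """Dyads with at least one sampled endpoint (the three observed blocks)."""
    return {(i, j) for i, j in combinations(sorted(nodes), 2)
            if i in sampled or j in sampled}
```

With 36 nodes, a sample consisting of two isolated seeds observes $2 \times 34 + 1 = 69$ of the 630 dyads, and a sample of 5 nodes observes $630 - \binom{31}{2} = 165$ dyads.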

For each of these samples we use the methods of Section 4.1 to esti- 
mate the parameters. We can then compare them to the MLE for the com- 
plete dataset. For these networks, the MLEs are obtained using statnet 
[Handcock et al. (2003)], both for the natural parametrization and for the 
mean value parameterization [see Handcock (2003)]. 

The mean value parameters are a function of the natural parameters, 
specifically the expected values of the sufficient statistics given the values of 
the natural parameters. 
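
Written out, with $\mu$ as our assumed symbol for the mean value parameter, the relation is the following standard exponential family identity (it is our gloss and is not displayed in the original text):

```latex
\mu(\eta) \;=\; E_\eta[g(Y)]
  \;=\; \sum_{y \in \mathcal{Y}} g(y)\, \exp\{\eta \cdot g(y) - \kappa(\eta)\}
  \;=\; \nabla_\eta\, \kappa(\eta).
```

Because $\mu$ is on the scale of the statistics themselves (for example, an expected number of edges), estimates in this parameterization can be read directly against observed counts.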

There are two isolates, that is nodes with no relations. If these two are se- 
lected as the two seeds, only 69 of the 630 dyads are observed, and no edges 
are observed. Therefore, the MLE associated with this sample includes (neg- 
ative) infinite values, on the boundary of the convex hull. For this reason, 
we exclude this sample from our analyses. Practically, this exclusion is rea- 
sonable in that it is unlikely any researcher drawing a link-tracing sample 
including only two isolated nodes will proceed with analysis of that sample. 

One way to assess the effect of the link-tracing design is to compare the
estimates from the sampled data to that of the complete data. As a measure
of the difference between the estimates in the metric of the model, we use the
Kullback-Leibler divergence from the model implied by the complete data
estimate to that of the sampled data estimate. Recall that the Kullback-Leibler
divergence of a distribution with probability mass function $p$ from
the distribution with probability mass function $q$ is

$$E_q[\log(q) - \log(p)].$$

Let $\eta$ and $\xi$ be alternative parameters for the model (4.1). The Kullback-Leibler
divergence, $\mathrm{KL}(\xi, \eta)$, of the ERGM with parameter $\eta$ from the ERGM
with parameter $\xi$ is

$$\begin{aligned}
\mathrm{KL}(\xi, \eta) &= \sum_{y \in \mathcal{Y}} (\xi - \eta) \cdot g(y)\, P_\xi(Y = y) + \kappa(\eta) - \kappa(\xi) \\
&= (\xi - \eta) \cdot E_\xi[g(Y)] + \kappa(\eta) - \kappa(\xi).
\end{aligned}$$

If $\xi$ is the complete data MLE then $E_\xi[g(Y)] = g(y_{\mathrm{obs}})$ are the observed
statistics (given in the complete data value column of Table 3). The divergence
can be easily computed using the MCMC algorithms of Section 4.1.
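
The closed form can be checked numerically on a network small enough to enumerate. The sketch below is illustrative and replaces the MCMC estimation used in the paper by exact enumeration; the statistics in `g` (edges and triangles) and all function names are our assumptions.

```python
import math
from itertools import combinations, product

def g(edges, n):
    """Illustrative statistics (an assumption): (edge count, triangle count)."""
    triangles = sum(
        1 for a, b, c in combinations(range(n), 3)
        if (a, b) in edges and (a, c) in edges and (b, c) in edges
    )
    return (len(edges), triangles)

def all_statistics(n):
    """g(y) for every undirected network y on n nodes."""
    dyads = list(combinations(range(n), 2))
    return [g({d for d, v in zip(dyads, values) if v}, n)
            for values in product((0, 1), repeat=len(dyads))]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def kappa(theta, stats):
    """Log normalizing constant, by enumeration."""
    return math.log(sum(math.exp(dot(theta, s)) for s in stats))

def kl_closed_form(xi, eta, stats):
    """(xi - eta) . E_xi[g(Y)] + kappa(eta) - kappa(xi)."""
    k_xi, k_eta = kappa(xi, stats), kappa(eta, stats)
    mean_g = [sum(s[k] * math.exp(dot(xi, s) - k_xi) for s in stats)
              for k in range(len(xi))]
    diff = [a - b for a, b in zip(xi, eta)]
    return dot(diff, mean_g) + k_eta - k_xi

def kl_direct(xi, eta, stats):
    """E_q[log q - log p] with q = P_xi and p = P_eta, term by term."""
    k_xi, k_eta = kappa(xi, stats), kappa(eta, stats)
    total = 0.0
    for s in stats:
        log_q = dot(xi, s) - k_xi
        log_p = dot(eta, s) - k_eta
        total += math.exp(log_q) * (log_q - log_p)
    return total
```

The two functions agree up to floating-point error, and the divergence is zero when the two parameter vectors coincide.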

Figure 2 plots the Kullback-Leibler divergence of the MLEs based on the
629 samples from the complete data MLE. The Kullback-Leibler divergences
of the two smallest samples, including only 5 nodes (165 dyads), are about
14 and have not been plotted to reduce the vertical scale. The horizontal
axis is the number of observed dyads in the sample. The plot indicates
how the information in the data about the complete data MLE approaches
that of the complete data as the number of sampled dyads approaches the
full number. The key feature of this figure is the variation in information
content among samples of the same size, especially for the smaller sample
sizes. Different seeds lead to samples that tell us different things about the
model even when the number of partners surveyed is the same.

Fig. 2. Kullback-Leibler divergence of the MLEs based on the samples compared to the
complete data MLE. As the number of dyads sampled increases, the information content
of the samples approaches that of the complete data. The information loss for the majority
of samples is modest. [Figure: divergence plotted against the number of observed dyads.]

Table 3
Bias and Root Mean Squared Error (RMSE) of mean value parameter MLE based on
two-wave samples as percentages of true parameter values and efficiencies

Mean value      Complete      Bias    RMSE    Efficiency
parameter       data value    (%)     (%)     loss (%)

Structural
  Edges          115.00       0.4     2.0       1.8
  GWESP          190.31       0.4     2.8       1.9
Nodal
  Seniority      130.19       0.3     1.8       1.4
  Practice       129.00       0.2     2.6       3.4
Homophily
  Practice        72.00       0.1     2.0       1.7
  Gender          99.00       0.5     2.1       1.8
  Office          85.00       0.7     2.7       3.0

For more specific information on the individual estimates, we can com- 
pute the bias of the estimates based on the samples as the mean differ- 
ence between the parameter estimates from the samples and that of the 
complete network. The root mean squared error (RMSE) is the square- 
root of the mean of the squared difference between the parameter esti- 
mates from each sample and the complete data estimates. The efficiency 
loss of the sampled estimate is the ratio of the mean squared error and 
the variance of the sampling distribution of the estimate based on the full 
data. This standardizes the error in the sampled estimates by the variation 
in the complete data estimates. We also complete a similar comparison of 
the estimates under the alternative mean value parametrization [Handcock 
(2003)]. 
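
These summaries can be sketched for a single parameter as follows. The sketch is ours and makes two assumptions that the text does not spell out: percentages are taken relative to the absolute value of the complete data estimate, and the efficiency loss is reported as 100 times the ratio of the mean squared error to the complete data variance. The argument `complete_variance` is a hypothetical input that would come from the complete data fit.

```python
import math

def summarize(estimates, complete_estimate, complete_variance):
    """Bias, RMSE and efficiency loss of sample-based MLEs for one parameter.

    estimates:         MLEs of the parameter, one per sample
    complete_estimate: MLE of the parameter from the complete network
    complete_variance: sampling variance of the complete data MLE
    """
    m = len(estimates)
    errors = [e - complete_estimate for e in estimates]
    bias = sum(errors) / m
    mse = sum(err ** 2 for err in errors) / m
    scale = abs(complete_estimate)          # assumed reference for percentages
    return {
        "bias_pct": 100 * bias / scale,
        "rmse_pct": 100 * math.sqrt(mse) / scale,
        "efficiency_loss_pct": 100 * mse / complete_variance,
    }
```

In the study this calculation would be repeated for each of the seven parameters over the 629 retained samples.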

The properties of the natural parameter estimates are summarized in
Table 2. The bias and root mean squared error are presented in percentages
of the complete data parameter estimates.
The bias is very small and the RMSE is modest. The efficiency loss is
2%-3% on average. Note that these population-average figures obscure the
variation in loss over individual samples apparent in Figure 2.

Table 3 is the mean value parameterization analog of Table 2. As these are 
on the same measurement scale as the statistics they are easier to interpret. 
Again we see that the estimates are approximately unbiased and the RMSE 
and efficiency losses are small. 

6. Discussion. In this paper we give a concise and systematic statistical 
framework for dealing with partially observed network data resulting from a 
designed sample. The framework includes, but is not restricted to, adaptive 
network sampling designs. We present a definition of a network design which 
is amenable to a given model and a result on likelihood-based inference under 
such designs. 

An important simple result of this framework is that sampled networks 
are not "biased" but can be representative if analyzed correctly. Many au- 
thors have confused the ideas of simple random sampling of the dyads with 
representative designs. The results of this paper indicate that simple random 
sampling is not necessary for valid inference. In fact, the most commonly 
used designs can be easily taken into account. Hence, despite their form, 
inference from adaptive network samples is tractable. 

It is illustrative to compare our approach to that of Stumpf, Wiuf and May 
(2005). These authors highlight the difference between the structure of a net- 
work and that of a sub-network induced by Bernoulli sampling of its nodes. 
The framework in this paper allows valid inference for the properties of the 
network based on its partial observation. This is because we fit a broad 
class of models compatible with an arbitrary set of network statistics (e.g., 
ERGM) for the complete network and use a method of inference that does 
not rely on equality between the structure of the full and sub-networks. As 
illustrated by the work of Stumpf, Wiuf and May (2005), treating the ob- 
served portion as if it were the full network may lead to invalid inference 
about characteristics of the full network such as the degree distribution. 

We have also shown that likelihood-based inference from an adaptive net- 
work sample can be conducted using a complete network model. We have 
shown that such inference is both principled and practical. The likelihood 
framework naturally accommodates standard sampling designs. Note that 
in a design-based frame, principled inference would require a great deal of 
effort to precisely characterize the sampling designs. The result that link- 
tracing designs are adaptive and can be analyzed with complex likelihood 
based methods is very valuable in practice as these designs have previously 
not been analyzed with general exponential family random graph (or
similar) models. The only prior work appears to be that of Thompson and Frank
(2000) who applied a less complex model class.

In our application we show that an adaptive network sampling of a
collaboration network can lead to effective estimates of the model parameters
in the vast majority of cases. We find that the MLEs from the samples 
have only modest bias (compared to the complete data estimate) and an 
error that only increases slowly with the number of unobserved dyads. We 
also show that the information content of the sample (with respect to the 
model), varies greatly even for samples of the same size. For conventional 
samples of i.i.d. random variables, the Fisher information is simply propor- 
tional to the sample size. In the network setting with dependence terms, 
however, the Fisher information will depend on the specific set of nodes and 
dyads sampled. For example, the information component corresponding to 
the GWESP term in the example will be larger for samples in which more 
pairs of nodes joined by edges are sampled, as GWESP applies only to pairs 
of nodes joined by edges. If no such dyads were sampled, there would be no 
information in the sample about the propensity for nodes joined by edges 
to have relations in common. 
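
This dependence can be made explicit with a standard missing-data identity for exponential families, stated here as a sketch that is not displayed in the original text. Writing $\ell(\eta)$ (our notation) for the logarithm of the face-value likelihood of Section 4.1, $L[\eta \mid Y_{\mathrm{obs}} = y_{\mathrm{obs}}] \propto \exp[\kappa(\eta \mid y_{\mathrm{obs}}) - \kappa(\eta)]$, and differentiating twice:

```latex
\ell(\eta) = \kappa(\eta \mid y_{\mathrm{obs}}) - \kappa(\eta) + \text{const},
\qquad
\nabla \ell(\eta) = E_\eta[g(Y) \mid Y_{\mathrm{obs}} = y_{\mathrm{obs}}] - E_\eta[g(Y)],
\qquad
-\nabla^2 \ell(\eta) = \mathrm{Cov}_\eta[g(Y)]
   - \mathrm{Cov}_\eta[g(Y) \mid Y_{\mathrm{obs}} = y_{\mathrm{obs}}].
```

The first covariance is the complete data information; the second is the information lost to the unobserved dyads, and it depends on which dyads were sampled and what was observed on them, not only on how many.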

In practice the sample is a result of a combination of the sampling
design and an out-of-design mechanism. The sampling design is the part of
the observation process under the control of the surveyor. When adaptive
designs are executed faithfully, the unknown dyads are assumed to be
intentionally unobserved, or missing by design. Note that the definition of
control may be extended to nonamenable sampling designs, for example by
allowing the design to depend on unknown factors, such as the unrecorded
values of variables used for stratification. The out-of-design mechanism is
the nonintentional nonobservation of network information (e.g., due to the
failure to report links, incomplete measurement of links and attrition from
longitudinal surveys). This is also referred to, in general, as the non-response
mechanism. We consider the joint effect of sampling and missing data in a
companion paper [Handcock and Gile (2007)].

SUPPLEMENTARY MATERIAL

Supplement: Software used in the simulation study 

(DOI: 10.1214/08-AOAS221SUPP; .zip). The code used to perform this 
study is written in the R statistical language [R Development Core Team 
(2007)] and is based on statnet, an open-source software suite for network 
modeling [Handcock et al. (2003)]. We provide the code and documentation 
for it with links to the statnet website. 






REFERENCES 

Barndorff-Nielsen, O. E. (1978). Information and Exponential Families in Statistical 
Theory. Wiley, New York. MR0489333 

Besag, J. (1974). Spatial interaction and the statistical analysis of lattice systems (with 
discussion). J. Roy. Statist. Soc. Ser. B 36 192-236. MR0373208 

Corander, J., Dahmström, K. and Dahmström, P. (1998). Maximum likelihood
estimation for Markov graphs. Research report, Dept. Statistics, Univ. Stockholm.

Corander, J., Dahmström, K. and Dahmström, P. (2002). Maximum likelihood
estimation for exponential random graph models. In Contributions to Social Network
Analysis, Information Theory, and Other Topics in Statistics; A Festschrift in Honour
of Ove Frank (J. Hagberg, ed.) 1-17. Dept. Statistics, Univ. Stockholm.

Crouch, B., Wasserman, S. and Trachtenberg, F. (1998). Markov chain Monte Carlo 
maximum likelihood estimation for p* social network models. In The XVIII Interna- 
tional Sunbelt Social Network Conference, Sitges, Spain. 

Frank, O. (2005). Network Sampling and Model Fitting. In Models and Methods in Social 
Network Analysis (J. S. P. Carrington and S. S. Wasserman, eds.) 31-56. Cambridge 
Univ. Press, Cambridge. 

Frank, O. and Strauss, D. (1986). Markov Graphs. J. Amer. Statist. Assoc. 81 832-842. 
MR0860518 

Geyer, C. J. and Thompson, E. A. (1992). Constrained Monte Carlo maximum likeli- 
hood calculations (with discussion). J. Roy. Statist. Soc. Ser. B 54 657-699. MR1185217 

Handcock, M. S. (2002). Degeneracy and inference for social network models. In The 
Sunbelt XXII International Social Network Conference, New Orleans, LA. 

Handcock, M. S. (2003). Assessing degeneracy in statistical models of social networks. 
Working paper 39, Center for Statistics and the Social Sciences, Univ. Washington. 
Available at http://www.csss.washington.edu/Papers. 

Handcock, M. S. and Gile, K. J. (2007). Modeling social networks with sampled or 
missing data. Working paper 75, Center for Statistics and the Social Sciences, Univ. 
Washington. Available at http://www.csss.washington.edu/Papers. 

Handcock, M. S. and Gile, K. J. (2010). Supplement to "Modeling social networks
from sampled data." DOI: 10.1214/08-AOAS221SUPP.

Handcock, M. S., Hunter, D. R., Butts, C. T., Goodreau, S. M. and Morris,
M. (2003). statnet: Software tools for the statistical modeling of network data.
statnet project http://statnet.org/, Seattle, WA. R package version 2.0. Available at
http://CRAN.R-project.org/package=statnet.

Hunter, D. R. and Handcock, M. S. (2006). Inference in curved exponential family 
models for networks. J. Comput. Graph. Statist. 15 565-583. MR2291264 

Lazega, E. (2001). The Collegial Phenomenon: The Social Mechanisms of Cooperation
Among Peers in a Corporate Law Partnership. Oxford Univ. Press, Oxford.

Lehmann, E. L. (1983). Theory of Point Estimation. Wiley, New York, NY. MR0702834 

R Development Core Team (2007). R: A language and environment for statistical 
computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051- 
07-0, Version 2.6.1. Available at http://www.R-project.org/. 

Rosenbaum, P. R. and Rubin, D. B. (1983). The central role of the propensity score in 
observational studies for causal effects. Biometrika 70 41-55. MR0742974 

Rubin, D. B. (1976). Inference and missing data. Biometrika 63 581-592. MR0455196 

Särndal, C.-E., Swensson, B. and Wretman, J. (1992). Model Assisted Survey
Sampling. Springer, New York. MR1140409

Snijders, T. A. B. (1992). Estimation on the basis of snowball samples: How to weight.
Bulletin de Méthodologie Sociologique 36 59-70.






Snijders, T. A. B. (2002). Markov chain Monte Carlo estimation of exponential random
graph models. Journal of Social Structure 3 1-41.

Snijders, T. A. B., Pattison, P., Robins, G. L. and Handcock, M. S. (2006). New
specifications for exponential random graph models. Sociological Methodology 36
99-153.

Strauss, D. and Ikeda, M. (1990). Pseudolikelihood estimation for social networks. J.
Amer. Statist. Assoc. 85 204-212. MR1137368

Stumpf, M. P. H., Wiuf, C. and May, R. M. (2005). Subnets of scale-free networks
are not scale-free: Sampling properties of networks. Proc. Natl. Acad. Sci. USA 102
4221-4224.

Thompson, S. K. and Collins, L. M. (2002). Adaptive sampling in research on risk- 
related behaviors. Drug and Alcohol Dependence 68 S57-S67. 

Thompson, S. K. and Frank, O. (2000). Model-based estimation with link-tracing sam- 
pling designs. Survey Methodology 26 87-98. 

Thompson, S. K. and Seber, G. A. F. (1996). Adaptive Sampling. Wiley, New York. 
MR1390995 

Wasserman, S. and Faust, K. (1994). Social Network Analysis: Methods and Applica- 
tions. Cambridge Univ. Press. 



Department of Statistics 

University of California 

Los Angeles, California 90095-1554 

USA 

E-MAIL: handcock@stat.washington.edu 



Nuffield College
University of Oxford
New Road
Oxford OX1 1NF
United Kingdom

E-MAIL: krista.gile@nuffield.ox.ac.uk



